Prosody boundary detection through context-dependent position models

نویسندگان

  • Yue-Ning Hu
  • Min Chu
  • Chao Huang
  • Yan-Ning Zhang
چکیده

In this paper, we propose to convert the prosody boundary detection task into a syllable position labeling task. In order to detect both prosodic word and prosodic phrase boundaries, 6 types of syllable positions are defined. For each position, context-dependent position models are trained from manually labeled data. These models are used to label syllable positions in unseen speech. Word and phrase boundaries are then easily derived from syllable position labels. The proposed approach is tested with a large scale single speaker database. The precision and recall for word boundary are 96.1% and 90.1%, respectively, and for phrase boundary are 83.7% and 80.5%, respectively. Results of a listening test shows that only 28% of word boundaries and 50% of phrase of boundaries detected automatically are critical error, implying only about 2.2% and 10% errors for word and phrase boundaries, respectively. The results are rather good, especially when it is considered that only acoustic features are used in this work.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer

Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...

متن کامل

Automatic labeling of prosody

The paper proposes a framework for automatic prosody labeling. The labeling involves detection of the location of accented syllables and phrase boundaries, and recognition of pitch accent and boundary tone types. A number of classification models are designed to perform these tasks on the basis of small vectors of acoustic features. The models achieve high accuracy and their performance is comp...

متن کامل

Prosody-dependent Acoustic Modeling for Mandarin Speech Recognition

A study on introducing prosodic information to acoustic modeling (AM) for speech recognition is reported in this paper. It extends the conventional context-dependent (CD) triphone HMM modeling approach to further consider the dependency of phone model on the break type of nearby inter-syllable boundary. Four break types are considered, including major break, minor break, normal non-break, and t...

متن کامل

Improving the Robustness of Prosody Dependent Language Modeling Based on Prosody Syntax Dependence

This paper presents a novel approach that improves the robustness of prosody dependent language modeling by leveraging the dependence between prosody and syntax. A prosody dependent language model describes the joint probability distribution of concurrent word and prosody sequences and can be used to provide prior language constraints in a prosody dependent speech recognizer. Robust Maximum Lik...

متن کامل

Acoustic Cues for Automatic Determination of Phrasing

This paper proposes a framework of automatic determination of phrasing using acoustic features derived from the speech signal. The feature vectors were defined in a series of analyses investigating the acoustic-phonetic realization of minor and major phrase boundaries and different boundary types. The resulting representation was used to train statistical classifiers to automatically determine ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008